Activity: Describe the Run-time Architecture

Purpose: To analyze concurrency requirements, to identify processes, identify inter-process communication mechanisms, allocate inter-process coordination resources, identify process lifecycles, and distribute model elements among processes.
Steps:
Analyze Concurrency Requirements
Identify Processes and Threads
Identify Process Lifecycles
Identify Inter-Process Communication Mechanisms
Allocate Inter-Process Coordination Resources
Map Processes onto the Implementation Environment
Distribute Model Elements Among Processes
Input Artifacts: Supplementary Specifications, Capsules, Design Model
Resulting Artifacts: Process View of the Software Architecture Document
Worker: Architect
More Information: Concepts: Concurrency; Checkpoints: Process View; Guidelines: Concurrency
Tool Mentor: Documenting the Process View

Workflow Details:

Core Workflow: Analysis & Design

Refine the Architecture

Artifact: Capsules are used as the primary mechanism to represent concurrent threads of execution in the system. Capsules represent lightweight threads of control which may or may not map to threads or processes provided by the operating system environment. Several capsules may execute within a single thread or, if greater responsiveness to external events is needed, there may be one capsule per thread. This activity uses the basic capsule structure of the system to define a process architecture for the system. Early in the Elaboration phase this architecture will be quite preliminary, but by late Elaboration the processes and threads should be well-defined.

Analyze Concurrency Requirements

Purpose

To define the extent to which parallel execution of tasks is required for the system. This definition will help shape the architecture.

During Activity: Identify Design Elements, concurrency requirements driven primarily by naturally occurring demands for concurrency in the problem domain were considered. The result of this was a set of Artifact: Capsules, representing logical threads of control in the system. In this step, we consider other sources of concurrency requirements - those imposed by the non-functional requirements of the system.

Concurrency requirements are driven by:

The degree to which the system must be distributed. A system whose behavior must be distributed across processors or nodes virtually requires a multi-process architecture. A system which uses some sort of Database Management System or Transaction Manager also must consider the processes which those major subsystems introduce.
The computation intensity of key algorithms. In order to provide good response times, it may be necessary to place computationally intensive activities in a process or thread of their own so that the system is still able to respond to user inputs while computation takes place, albeit with fewer resources.
The degree of parallel execution supported by the environment. If the operating system or environment does not support threads (lightweight processes) there is little point in considering their impact on the system architecture.
The need for fault tolerance in the system. Backup processors require backup processes, and drive the need to keep primary and backup processes synchronized.
The arrival pattern of events in the system. In systems with external devices or sensors, the arrival patterns of incoming events may differ from sensor to sensor. Some events may be periodic (i.e. occur at a fixed interval, plus or minus a small amount) or aperiodic (i.e. with an irregular interval). Capsules representing devices which generate different event patterns will usually be assigned to different operating system threads, with different scheduling algorithms, to ensure that events or processing deadlines are not missed (if this is a requirement of the system).
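The point about computationally intensive algorithms can be illustrated with a minimal Python sketch. The names `heavy_computation` and `run_responsive` are hypothetical and not part of the process description; the sketch only assumes that the intensive work can be handed to a worker thread while the calling thread keeps handling user inputs.

```python
import queue
import threading
from typing import List, Tuple


def heavy_computation(n: int) -> int:
    """Hypothetical stand-in for a computationally intensive algorithm."""
    return sum(i * i for i in range(n))


def run_responsive(n: int, user_inputs: List[str]) -> Tuple[List[str], int]:
    """Handle user inputs on the calling thread while a worker thread computes."""
    results: "queue.Queue[int]" = queue.Queue()

    # The intensive activity gets a thread of its own.
    worker = threading.Thread(target=lambda: results.put(heavy_computation(n)))
    worker.start()

    # The calling thread is not blocked and can keep responding to inputs.
    handled = [f"handled {item}" for item in user_inputs]

    worker.join()  # wait for the computation before reading its result
    return handled, results.get()
```

In standard CPython builds, threads interleave rather than execute Python bytecode in parallel, so this keeps the system responsive but does not speed up the computation; a separate process would be the usual choice when true parallelism is required.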

As with many architectural problems, these requirements may conflict with one another, at least initially. Ranking the requirements in order of importance will help resolve such conflicts.

Identify Processes and Threads

Purpose

To define the processes and threads which will exist in the system.

Concepts: Concurrency

The simplest approach is to allocate all capsules to a common thread or process, since this minimizes context-switching overhead. However, in certain cases (and this will include many, if not most, real-time systems) it may be necessary to distribute the capsules across several threads or processes.

If a capsule makes a synchronous call to some other process or thread, this will automatically suspend all other capsules located in the invoking process or thread. (Note that a synchronous invocation between capsules in the same process or thread is equivalent to a procedure call.) This leads us to the conclusion that capsules should be grouped into processes or threads based on their need to run concurrently with synchronous invocations. That is, a capsule should be packaged in the same process or thread as another object that uses synchronous invocations only if it does not need to execute concurrently with that object. In the extreme case, this can lead to a separate thread or process for each capsule.
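The grouping rule can be sketched as a simple greedy allocation. This is an illustration under stated assumptions, not a prescribed algorithm: the `Capsule` record, its fields, and the function names are hypothetical, and the sketch assumes we already know which capsules make synchronous calls out of their thread and which capsules must not be stalled by them.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Capsule:
    name: str
    # True if the capsule blocks its thread while waiting on another thread or process.
    makes_sync_calls: bool = False
    # Names of capsules this capsule must be able to run concurrently with.
    concurrent_with: Set[str] = field(default_factory=set)


def conflicts(a: Capsule, b: Capsule) -> bool:
    """True if a and b should not share a thread under the rule above."""
    return (a.makes_sync_calls and a.name in b.concurrent_with) or (
        b.makes_sync_calls and b.name in a.concurrent_with
    )


def group_into_threads(capsules: List[Capsule]) -> List[List[str]]:
    """Greedily place each capsule in the first thread where it conflicts with no member."""
    groups: List[List[Capsule]] = []
    for capsule in capsules:
        for group in groups:
            if not any(conflicts(capsule, member) for member in group):
                group.append(capsule)
                break
        else:
            groups.append([capsule])  # no compatible thread found, create a new one
    return [[c.name for c in group] for group in groups]
```

A greedy pass like this does not guarantee the minimum number of threads; it only guarantees that no capsule shares a thread with a blocking capsule it must run concurrently with.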

As a general rule, in the above situations it is better to use lightweight threads instead of full-fledged processes since that involves less overhead. However, we may still want to take advantage of some of the special characteristics of processes in certain special cases. Since threads share the same address space, they are inherently more risky than processes. If the possibility of accidental overwrites is a concern, then processes are preferred. Furthermore, since processes represent independent units of recovery in most operating systems, it may be useful to allocate active objects to processes based on their need to recover independently of each other. That is, all capsules that need to be recovered as a unit might be packaged together in the same process.

For each separate flow of control needed by the system, create a process or a thread (lightweight process). A thread should be used in cases where there is a need for nested flow of control (i.e. within a process, there is a need for independent flow of control at the sub-task level).

For example, we can say (not necessarily in order of importance) that separate threads of control may be needed to:

Separate concerns between different areas of the software
Take advantage of multiple CPUs in a node or multiple nodes in a distributed system
Increase CPU utilization by allocating cycles to other activities when a thread of control is suspended
Prioritize activities
Support load sharing across several processes and processors
Achieve a higher system availability by having backup processes
Support the DBMS, Transaction Manager, or other major subsystems.

Example

In the Automated Teller Machine, asynchronous events must be handled coming from three different sources: the user of the system, the ATM devices (in the case of a jam in the cash dispenser, for example), or the ATM Network (in the case of a shutdown directive from the network). To handle these asynchronous events, we can define three separate threads of execution within the ATM itself, as shown below using active classes in UML.

Figure: Processes and Threads within the ATM
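One possible realization of the three threads of execution in the ATM example is sketched here in Python. The source names, the event strings, and the functions `event_source` and `run_atm_sources` are assumptions made for illustration; each thread simply forwards the events of one source to a shared queue.

```python
import queue
import threading
from typing import Dict, List, Tuple


def event_source(name: str, events: List[str], inbox: "queue.Queue[Tuple[str, str]]") -> None:
    """Hypothetical thread body: forward the events of one source to the shared inbox."""
    for event in events:
        inbox.put((name, event))


def run_atm_sources() -> List[Tuple[str, str]]:
    inbox: "queue.Queue[Tuple[str, str]]" = queue.Queue()
    # One thread of execution per asynchronous event source (illustrative data).
    sources: Dict[str, List[str]] = {
        "customer": ["card inserted"],
        "devices": ["cash dispenser jam"],
        "network": ["shutdown directive"],
    }
    threads = [
        threading.Thread(target=event_source, args=(name, events, inbox))
        for name, events in sources.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # All producers have finished, so draining with empty() is safe here.
    received = []
    while not inbox.empty():
        received.append(inbox.get())
    return sorted(received)  # arrival order varies between runs, so sort for display
```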

Identify Process Lifecycles

Purpose

To identify when processes and threads are created and destroyed.

Each process or thread of control must be created and destroyed. In a single-process architecture, process creation occurs when the application is started and process destruction occurs when the application ends. In multi-process architectures, new processes (or threads) are typically spawned or forked from the initial process created by the operating system when the application is started. These processes must be explicitly destroyed as well.

The sequence of events leading up to process creation and destruction must be determined and documented, as well as the mechanism for creation and deletion.

Example

In the Automated Teller Machine, one main process is started which is responsible for coordinating the behavior of the entire system. It in turn spawns a number of subordinate threads of control to monitor various parts of the system: the devices in the system, and events emanating from the customer and from the ATM Network. The creation of these processes and threads can be shown with active classes in UML, and the creation of instances of these active classes can be shown in a sequence diagram, as shown below:

Figure: Creation of processes and threads during system start-up
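The lifecycle described for the ATM (a main flow of control spawns subordinate threads at start-up and explicitly terminates them at shutdown) might look like the following Python sketch. The `Monitor` class and the `start_up` and `shut_down` functions are hypothetical names; the monitors do no real work and only wait for the shutdown signal.

```python
import threading
from typing import List, Tuple


class Monitor(threading.Thread):
    """Hypothetical subordinate thread that watches one part of the system."""

    def __init__(self, name: str, shutdown_requested: threading.Event) -> None:
        super().__init__(name=name)
        self.shutdown_requested = shutdown_requested  # shared signal owned by the main flow

    def run(self) -> None:
        # A real monitor would poll a device or read from the network here;
        # this sketch only waits until it is told to terminate.
        self.shutdown_requested.wait()


def start_up(names: List[str]) -> Tuple[threading.Event, List[Monitor]]:
    """Creation: the main flow of control spawns one monitor thread per name."""
    shutdown_requested = threading.Event()
    monitors = [Monitor(name, shutdown_requested) for name in names]
    for monitor in monitors:
        monitor.start()
    return shutdown_requested, monitors


def shut_down(shutdown_requested: threading.Event, monitors: List[Monitor]) -> None:
    """Destruction: signal every monitor, then wait until each has terminated."""
    shutdown_requested.set()
    for monitor in monitors:
        monitor.join()
```

Signalling with an event and then joining gives each thread the chance to terminate cleanly, which is one way of making the destruction mechanism explicit and documentable.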

Identify Inter-Process Communication Mechanisms

Purpose

To identify the means by which processes and threads will communicate.

Inter-process communication (IPC) mechanisms enable messages to be sent between objects executing in separate processes.

Typical inter-process communications mechanisms include:

Shared memory, with or without semaphores to ensure synchronization.
Rendezvous, especially when directly supported by a language such as Ada
Semaphores, used to block simultaneous access to shared resources
Message passing, both point-to-point and point-to-multipoint
Mailboxes
RPC - Remote procedure calls
Event Broadcast - using a "software bus" ("message bus architecture")

The choice of IPC mechanism will change the way the system is modeled; in a "message bus architecture", for example, there is no need for explicit associations between objects to send messages.
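To illustrate why a message bus removes the need for explicit associations, here is a minimal in-process sketch in Python. The `MessageBus` class and its methods are hypothetical; a real software bus would deliver messages across process boundaries, which this sketch does not attempt.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List


class MessageBus:
    """Hypothetical event broadcast bus: senders know topics, not receivers."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Any) -> int:
        """Deliver the message to every subscriber of the topic; return the delivery count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)
```

The publisher holds a reference only to the bus and a topic name, so the model needs no association between the sending and receiving objects.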

Allocate Inter-Process Coordination Resources

Purpose

To allocate scarce resources
To anticipate and manage potential performance bottlenecks

Inter-process communication mechanisms are typically scarce. Semaphores, shared memory, and mailboxes are typically fixed in size or number and cannot be increased without significant cost. RPC, messages and event broadcasts soak up increasingly scarce network bandwidth. When the system exceeds a resource threshold, it typically experiences non-linear performance degradation: once a scarce resource is used up, subsequent requests for it are likely to block, fail or be delayed.

If scarce resources are over-subscribed, there are several strategies to consider:

reducing the need for the scarce resource by reducing the number of processes
changing the usage of scarce resources (for one or more processes, choose a different, less scarce resource to use for the IPC mechanism)
increasing the quantity of the scarce resource (e.g. increasing the number of semaphores). This can be done for relatively small changes, but often has side effects or fixed limits.
sharing the scarce resource (e.g. only allocating the resource when it is needed, then letting go when done with it). This is expensive and may only forestall the resource crisis.

Regardless of the strategy chosen, the system should degrade gracefully (rather than crashing), and should provide adequate feedback to a system administrator to allow the problem to be resolved (if possible) in the field once the system is deployed.
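Graceful degradation with feedback for an administrator can be sketched as a pool that refuses requests instead of crashing when the resource runs out. The `ResourcePool` class is a hypothetical illustration built on Python's `threading.BoundedSemaphore`; the capacity and the wording of the status message are assumptions.

```python
import threading


class ResourcePool:
    """Hypothetical guard around a fixed number of scarce IPC resources."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.rejected = 0  # count of refused requests, kept for the administrator
        self._slots = threading.BoundedSemaphore(capacity)
        self._lock = threading.Lock()  # protects the rejected counter

    def try_acquire(self) -> bool:
        """Take a slot if one is free; otherwise record the refusal and return False."""
        if self._slots.acquire(blocking=False):
            return True
        with self._lock:
            self.rejected += 1
        return False

    def release(self) -> None:
        self._slots.release()

    def status(self) -> str:
        """Feedback that could be logged or shown to a system administrator."""
        return f"capacity={self.capacity} rejected={self.rejected}"
```

The caller decides what a refusal means (retry later, queue the request, or report an error), so exhausting the resource becomes a handled condition rather than a failure.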

If the system requires special configuration of the run-time environment in order to increase the availability of a critical resource (often controlled by re-configuring the operating system kernel), the system installation needs to either do this automatically, or instruct a system administrator to do this before the system can become operational. For example, the system may need to be re-booted before the change will take effect.

Map Processes onto the Implementation Environment

Purpose

To map the "flows of control" onto the concepts supported by the implementation environment.

Conceptual processes must be mapped onto specific constructs in the operating environment. In many environments there is a choice of process types, at the very least processes and threads. The choice will be based on the degree of coupling (processes are stand-alone, whereas threads run in the context of an enclosing process) and the performance requirements of the system (communication between threads is generally faster and more efficient than communication between processes).

In many systems, there may be a maximum number of threads per process or processes per node. These limits may not be absolute, but may be practical limits imposed by the availability of scarce resources. The threads and processes already running on a target node need to be considered along with the threads and processes proposed in the process architecture. The results of the earlier step, Allocate Inter-Process Coordination Resources, need to be considered when the mapping is done to make sure that a new performance problem is not being created.
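A check of a proposed process architecture against the limits of a target node could be sketched as follows. The function `check_node_budget`, its parameters, and the message wording are hypothetical; the actual limits depend on the operating environment and are not given in this text.

```python
from typing import Dict, List


def check_node_budget(
    proposed: Dict[str, int],
    running_processes: int,
    max_processes: int,
    max_threads_per_process: int,
) -> List[str]:
    """Return human-readable limit violations for one node (empty list if none).

    `proposed` maps each proposed process name to its planned thread count;
    `running_processes` counts the processes already present on the node.
    """
    problems: List[str] = []

    total = running_processes + len(proposed)
    if total > max_processes:
        problems.append(f"node would run {total} processes, limit is {max_processes}")

    for name, threads in sorted(proposed.items()):
        if threads > max_threads_per_process:
            problems.append(
                f"{name} needs {threads} threads, limit is {max_threads_per_process}"
            )
    return problems
```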

Distribute Model Elements Among Processes

Purpose

To determine the processes within which classes and subsystems should execute.

Instances of a given class or subsystem must execute within at least one process; they may in fact execute in several different processes. The process provides an "execution environment" for the class or subsystem.

Using two different strategies simultaneously, we determine the "right" amount of concurrency and define the "right" set of processes:

Inside-out

Starting from the Design Model, group classes and subsystems together in sets of cooperating elements that (a) closely cooperate with one another and (b) need to execute in the same thread of control. Consider the impact of introducing inter-process communication into the middle of a message sequence before separating elements into separate threads of control.
Conversely, separate classes and subsystems which do not interact at all, placing them in separate threads of control.
This clustering proceeds until the number of processes has been reduced to the smallest number that still allows distribution and use of the physical resources.
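The inside-out grouping can be approximated by clustering elements that cooperate, directly or transitively, into the same thread of control. The sketch below uses a union-find structure; the function `cluster_elements` and the idea of listing cooperating pairs as input are assumptions for illustration, and the sketch ignores the resource considerations mentioned above.

```python
from typing import Dict, Iterable, List, Tuple


def cluster_elements(
    elements: Iterable[str], cooperating: Iterable[Tuple[str, str]]
) -> List[List[str]]:
    """Group elements so that closely cooperating ones share a thread of control.

    Every name used in `cooperating` must also appear in `elements`.
    """
    parent: Dict[str, str] = {element: element for element in elements}

    def find(element: str) -> str:
        # Follow parent links to the cluster representative, shortening the path as we go.
        while parent[element] != element:
            parent[element] = parent[parent[element]]
            element = parent[element]
        return element

    for a, b in cooperating:
        parent[find(a)] = find(b)  # merge the two clusters

    clusters: Dict[str, List[str]] = {}
    for element in parent:
        clusters.setdefault(find(element), []).append(element)
    return sorted(sorted(members) for members in clusters.values())
```

Elements that never appear in a cooperating pair end up in a cluster of their own, which matches the advice to place non-interacting classes in separate threads of control.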

Outside-in

Identify external stimuli to which the system must respond. Define a separate thread of control to handle each stimulus and a separate server thread of control to provide each service.
Consider the data integrity and serialization constraints to reduce this initial set of threads of control to the number that can be supported by the execution environment.

This is not a linear, deterministic process leading to an optimal process view; it requires a few iterations to reach an acceptable compromise.

Example

The following diagram illustrates how classes within the ATM are distributed among the processes and threads in the system.

Figure: Mapping of classes onto processes for the ATM

Rational Unified Process